THE DETERMINACY OF INFINITE GAMES WITH 
EVENTUAL PERFECT MONITORING 



ERAN SHMAYA 

Abstract. An infinite two-player zero-sum game with a Borel
winning set, in which the opponent's actions are monitored eventually
but not necessarily immediately after they are played, admits
a value. The proof relies on a representation of the game as a
stochastic game with perfect information, in which Nature operates
as a delegate for the players and performs the randomizations for
them.



1. Setup 

Consider an infinite two-player zero-sum game that is given by a
triple $(A, (\mathcal{P}_n)_{n \in \mathbb{N}}, W)$, where $A$ is a finite set of actions,
$\mathcal{P}_n$ is a partition of $A^n$ for every $n \in \mathbb{N}$, and $W \subseteq A^{\mathbb{N}}$ is a Borel
set, the winning set of player 1. The game is played in stages: Player 1
chooses an action $a_0 \in A$; then player 2 chooses an action $a_1 \in A$; then
player 1 chooses an action $a_2 \in A$, and so on, ad infinitum. Before
choosing $a_n$, the player who plays at stage $n$ receives some information
about his opponent's actions at previous stages: Let $h = (a_0, a_1, \dots, a_{n-1})$
be the finite history that consists of the actions played before stage $n$;
then before choosing $a_n$, the player who plays at stage $n$ observes the
atom of $\mathcal{P}_n$ that contains $h$. Player 1 wins the game if the infinite
history $(a_0, a_1, \dots)$ is in $W$. When the action set and information
partitions are fixed, I denote the game by $\Gamma(W)$.

A behavioral strategy $x = (x_n)$ of player 1 is a sequence
$\{x_n : \mathcal{P}_n \to \Delta(A)\}_{n = 0, 2, 4, \dots}$ of functions, where $\Delta(A)$ is the set of
probability distributions over $A$: At stage $n$, after observing the finite
history $h = (a_0, a_1, \dots, a_{n-1})$, player 1 randomizes his action according
to $x_n(\pi_n(h))$, where $\pi_n(h)$ is the atom of $\mathcal{P}_n$ that contains $h$. Abusing
notation, I sometimes write $x_n(h)$ instead of $x_n(\pi_n(h))$. Behavioral
strategies $y$ of player 2 are defined analogously.

Every pair $x, y$ of strategies induces a probability distribution $\mu_{x,y}$
over the set $A^{\mathbb{N}}$ of infinite histories or plays: $\mu_{x,y}$ is the joint
distribution of a sequence $\alpha_0, \alpha_1, \dots$ of $A$-valued random variables
such that

(1)  $\mathbb{P}(\alpha_n = a \mid \alpha_0, \dots, \alpha_{n-1}) = \begin{cases} x_n(\alpha_0, \dots, \alpha_{n-1})[a], & \text{if } n \text{ is even}, \\ y_n(\alpha_0, \dots, \alpha_{n-1})[a], & \text{if } n \text{ is odd}. \end{cases}$

I call such a sequence of random variables an $(x, y)$-random play. If the
players play according to the strategy profile $(x, y)$, then the expected
payoff for player 1 is given by

(2)  $\mu_{x,y}(W) = \mathbb{P}\bigl((\alpha_0, \alpha_1, \dots) \in W\bigr),$

where $\alpha_0, \alpha_1, \dots$ is an $(x, y)$-random play.

The lower value $\underline{\mathrm{val}}\,\Gamma(W)$ and upper value $\overline{\mathrm{val}}\,\Gamma(W)$ of the game
$\Gamma(W)$ are defined by:

$\underline{\mathrm{val}}\,\Gamma(W) = \sup_x \inf_y \mu_{x,y}(W)$, and $\overline{\mathrm{val}}\,\Gamma(W) = \inf_y \sup_x \mu_{x,y}(W)$,

where the suprema are taken over all strategies $x$ of player 1 and the
infima over all strategies $y$ of player 2. The game is determined if the
lower and upper values are equal, $\underline{\mathrm{val}}\,\Gamma(W) = \overline{\mathrm{val}}\,\Gamma(W)$, in which
case their common value is called the value of the game. For $\varepsilon > 0$,
a strategy $x$ of player 1 is $\varepsilon$-optimal if $\mu_{x,y}(W) \ge \underline{\mathrm{val}}\,\Gamma(W) - \varepsilon$ for
every strategy $y$ of player 2. We also say that player 1 can guarantee
a payoff of at least $\underline{\mathrm{val}}\,\Gamma(W) - \varepsilon$ by playing such a strategy $x$.
$\varepsilon$-optimal strategies of player 2 are defined analogously.

Let $\sim_n$ be the equivalence relation over infinite histories such that
$u \sim_n u'$ whenever $u|_n$ and $u'|_n$ belong to the same atom of $\mathcal{P}_n$, where
$u|_n$ and $u'|_n$ are the initial segments of $u$ and $u'$ of length $n$. The
interpretation is that if $u, u' \in A^{\mathbb{N}}$ and $u \sim_n u'$, then at stage $n$ the
player cannot distinguish between $u$ and $u'$. Say that at stage $n$ the
player observes the action of stage $m$ if, for every pair of infinite
histories $u = (a_0, a_1, \dots)$ and $u' = (a'_0, a'_1, \dots)$, $u \sim_n u'$ implies $a_m = a'_m$.

1.1. Definition. The information partitions $(\mathcal{P}_n)_{n \ge 0}$ satisfy perfect
recall if the following conditions are satisfied:

(1) Players know their own actions: at stage $n$ the player observes
the action of stage $n - 2$.

(2) Players do not forget information: if $u, u' \in A^{\mathbb{N}}$ and $u \sim_{n+2} u'$,
then $u \sim_n u'$.

The setup of infinite games with perfect recall is general enough to
subsume two special cases which have been extensively studied:

Borel games. If, at every stage $n$, players observe previous actions
of their opponents, then the game is called a Borel game or a game
with perfect information. Gale and Stewart [3] proved that such games
are determined if the winning set $W$ is closed. In a seminal paper,
Martin [7] proved that the game is determined for every Borel winning
set $W$. Borel games admit pure 0-optimal strategies, and the value is
0 or 1. Moreover, Borel games with an infinite action set $A$ are also
determined.

Blackwell games. Assume that at even stages $n = 2k$, player 1
observes the actions of stages $0, 1, \dots, 2k - 1$, and at odd stages
$n = 2k + 1$, player 2 observes the actions of stages $0, 1, \dots, 2k - 1$ (his own
actions and all the previous actions of his opponent except for the last
one), and that the information partitions are the coarsest partitions
that satisfy these conditions. This means essentially that the players
play simultaneously at stages $2k$ and $2k + 1$ for every $k \in \mathbb{N}$, and
then both actions are announced. Such games are called Blackwell
games. Blackwell [1], [2] proved the determinacy of Blackwell games
(which he called "infinite games with imperfect information") with a
$G_\delta$ winning set, and conjectured that every Blackwell game with a
Borel winning set is determined. Vervoort [11] advanced higher in the
Borel hierarchy, proving determinacy of games with $G_{\delta\sigma}$ winning sets.
Blackwell's conjecture was proved by Martin in 1998 [8].

Borel games and Blackwell games differ in the timing of monitoring,
that is, of the observation of the opponent's actions: whereas in Borel
games monitoring is immediate, in Blackwell games player 2's monitoring
is delayed by one stage. Both setups satisfy a property that I call
eventual perfect monitoring. This means that the entire history of the
game is known to every player at infinity. One example of eventual
perfect monitoring, of which Blackwell games are a special case, is
delayed monitoring, when the action of stage $m$ is monitored after some
delay $d_m$. But the setup of games with eventual perfect monitoring is
more general than the setup of games with delayed monitoring. First, the
former setup allows the length of the delay to depend on the history of
the game. Second, it allows the information to be revealed in pieces;
for example, a player can observe some function of the previous actions
of his opponent before he observes the actions themselves.

1.2. Definition. The information partitions $(\mathcal{P}_n)_{n \in \mathbb{N}}$ satisfy eventual
perfect monitoring if for every $u, u' \in A^{\mathbb{N}}$ such that $u \ne u'$, there exist
an even $n$ such that $u \not\sim_n u'$ and an odd $n$ such that $u \not\sim_n u'$.
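
The definitions of observing an action and of perfect recall can be checked mechanically on short histories. The following Python sketch does this by brute force under an assumed delayed-monitoring rule in the spirit of the discussion above: the player who moves at stage n sees his own past actions and the opponent's actions at stages smaller than n - delay. The action labels, the parameter `delay`, and all function names are illustrative assumptions rather than notation from the paper, and a finite check of this kind only illustrates the definitions.

```python
from itertools import product

A = ("S", "L")  # hypothetical two-letter action set


def signal(h, delay):
    """Label of the atom containing the finite history h (assumed encoding).

    The player moving at stage n = len(h) sees the actions at stages of the
    same parity as n (his own) and the opponent's actions at stages m < n - delay.
    Two histories lie in the same atom exactly when their signals coincide.
    """
    n = len(h)
    return tuple(
        a if (m % 2 == n % 2 or m < n - delay) else None
        for m, a in enumerate(h)
    )


def observes(n, m, delay):
    """True if at stage n the player observes the action of stage m < n."""
    for h, g in product(product(A, repeat=n), repeat=2):
        if signal(h, delay) == signal(g, delay) and h[m] != g[m]:
            return False
    return True


def no_forgetting(n, delay):
    """Condition (2) of perfect recall between stages n + 2 and n."""
    for h, g in product(product(A, repeat=n + 2), repeat=2):
        if signal(h, delay) == signal(g, delay):
            if signal(h[:n], delay) != signal(g[:n], delay):
                return False
    return True


def first_observation_stage(m, delay, max_n):
    """Smallest n > m of opposite parity at which stage m is observed, if any."""
    for n in range(m + 1, max_n + 1, 2):
        if observes(n, m, delay):
            return n
    return None
```

For example, `first_observation_stage(1, 2, 6)` searches the even stages 2, 4, 6 for the first one at which player 1 has seen player 2's action of stage 1.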

The purpose of this paper is to prove the following theorem. 

1.3. Theorem. Let $\Gamma = (A, (\mathcal{P}_n)_{n \ge 0}, W)$ be an infinite game with a
finite action set, a Borel winning set, perfect recall, and eventual perfect
monitoring. Then $\Gamma$ is determined.

The proof of the theorem relies heavily on the stochastic extension
of Martin's theorem about the determinacy of Blackwell games. However,
except for the simple case in which the stages are divided into
blocks and previous actions are monitored at the end of each block,
I was unable to find an immediate reduction of the eventual perfect
monitoring setup to the Blackwell games setup, nor was I able to adapt
Martin's proof to the eventual perfect monitoring setup.

Infinite games with Borel winning sets have recently been applied in 
economics literature on testing the quality of probabilistic predictions. 
Consider a forecaster who claims to know the probability distribution 
that governs some stochastic process. To prove his claim, the forecaster 
provides probabilistic predictions about the process. An inspector tests 
the forecaster's reliability using the infinite sequence of predictions pro- 
vided by the forecaster and the observed realization of the process. 
Using Martin's Theorem about the determinacy of Blackwell games, 
I proved [9] that any inspection which is based on predictions about 
the next-day realization of the process is manipulable, i.e., it can be 
strategically passed by a charlatan. Theorem 1.3 can be used to prove
that tests based on predictions about an arbitrarily long finite horizon
are also manipulable [9, Section 5].

In Section 2 I give some examples of games with and without eventual
perfect monitoring. In Section 3 I prove the determinacy of infinite
games with perfect recall and a compact winning set; this result is
used in the proof of Theorem 1.3. The proof of the theorem is in
Section 4. Martin's Theorem is reviewed in the appendix.

2. Examples 

All the examples in this section have the same action set and the
same winning set. The action set is $A = \{S, L\}$. At every stage,
each player decides whether to Stay or Leave the game. Once a player
leaves, his future actions do not affect the outcome of the game. For an
infinite history $u = (a_0, a_1, \dots)$, let $n^1(u) = \min\{n \text{ even} \mid a_n = L\}$ be
the (possibly infinite) first stage in which player 1 left the game, and
let $n^2(u)$ be the first stage in which player 2 left. Let

$W = \{u \in A^{\mathbb{N}} \mid (n^2(u) < n^1(u) < \infty) \text{ or } (n^1(u) < \infty \text{ and } n^2(u) = \infty)\}$

be the winning set of player 1. So player 1 wins if he leaves the game
after player 2 leaves, or if he leaves the game at some point and player
2 never leaves. In Example 2.1 both players have eventual perfect
monitoring. In Example 2.2 neither player has eventual perfect
monitoring. In Example 2.3 only player 1 has eventual perfect
monitoring.

2.1. Example. Let $k$ be a positive integer. Assume that at stage $n$
each player observes his own actions and the actions of his opponent
at stages smaller than $n - k$. Then the value of the game is 0. An
optimal strategy for player 2 is to play $S$ as long as he is not informed
that player 1 has played $L$, and to play $L$ as soon as he learns that
player 1 played $L$ at some point.

Note that in the previous example, the number k need not be con- 
stant. It can depend on the stage number, and can differ between the 
players. As long as player 2 knows the actions of his opponent even- 
tually, the game is determined and the value is 0. (The fact that the 
value in this example does not depend on the information partitions is 
not typical.)

2.2. Example. Assume that each player knows his own previous actions,
but does not observe his opponent's actions. Then the game is
not determined. In fact, $\underline{\mathrm{val}}\,\Gamma = 0$ and $\overline{\mathrm{val}}\,\Gamma = 1$.

2.3. Example. Assume that player 1 observes the past actions of player
2, but player 2 doesn't observe the past actions of player 1. Then the
game is not determined. In fact, $\underline{\mathrm{val}}\,\Gamma = 1/2$ and $\overline{\mathrm{val}}\,\Gamma = 1$. An optimal
strategy for player 1 is: At stage 0 play $L$ or $S$ with probability 1/2 each,
and, at stage $2k$ for $k \ge 1$, play the action of player 2 from stage $2k - 1$.
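
The claim that this strategy guarantees 1/2 can be checked against the pure strategies of player 2, which (since player 2 observes nothing) amount to choosing a first leaving stage or never leaving. The following is a minimal sketch with exact arithmetic; the truncation horizon and the function names are assumptions made for illustration. Treating "no L within the horizon" as never leaving is legitimate here only because, under this particular strategy, player 1 leaves either at stage 0 or one stage after player 2 does. The sketch verifies the guarantee only, not that 1/2 is the best player 1 can do.

```python
from fractions import Fraction

INF = float("inf")


def first_leave(history, parity):
    """First stage of the given parity at which L is played, or INF."""
    for n, a in enumerate(history):
        if n % 2 == parity and a == "L":
            return n
    return INF


def player1_wins(n1, n2):
    """Membership in the winning set W of Section 2."""
    return (n2 < n1 < INF) or (n1 < INF and n2 == INF)


def play(first_action, p2_leave_stage, horizon):
    """Finite play: player 1 copies player 2's last action from stage 2 on."""
    h = []
    for n in range(horizon):
        if n == 0:
            h.append(first_action)
        elif n % 2 == 0:
            h.append(h[n - 1])  # player 1 repeats the action of stage n - 1
        else:
            h.append("L" if n == p2_leave_stage else "S")
    return h


def win_probability(p2_leave_stage, horizon=20):
    """Probability that player 1 wins; p2_leave_stage is odd or None (never)."""
    total = Fraction(0)
    for first_action in ("S", "L"):  # each chosen with probability 1/2
        h = play(first_action, p2_leave_stage, horizon)
        if player1_wins(first_leave(h, 0), first_leave(h, 1)):
            total += Fraction(1, 2)
    return total
```

Calling `win_probability(t)` for an odd `t` well below the horizon, or `win_probability(None)`, evaluates the payoff against the corresponding pure strategy of player 2.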



3. Games with a compact winning set

The set $A^{\mathbb{N}}$ of plays is naturally endowed with the product topology.
In this section I prove the special case of Theorem 1.3 for compact
winning sets. The determinacy follows from perfect recall alone, even
without eventual perfect monitoring. The proof relies on two standard
results from game theory: the Minimax Theorem for normal form
games and Kuhn's Theorem.

Recall that a normal form game is given by a triple $(\Sigma, \Theta, R)$ where
$\Sigma$ and $\Theta$ are Borel spaces of pure strategies for players 1 and 2, and
$R : \Sigma \times \Theta \to [0, 1]$ is the payoff function. A mixed strategy $\xi$ of player
1 is a probability distribution over $\Sigma$. Mixed strategies $\tau$ of player 2
are defined analogously. Say that the mixed extension of the normal
form game $(\Sigma, \Theta, R)$ is determined if

$\sup_{\xi} \inf_{\theta \in \Theta} \int R(\sigma, \theta)\, \xi(d\sigma) = \inf_{\tau} \sup_{\sigma \in \Sigma} \int R(\sigma, \theta)\, \tau(d\theta),$

where the supremum is over mixed strategies $\xi$ of player 1 and the
infimum over mixed strategies $\tau$ of player 2. The Minimax Theorem
[10, Proposition A.10] states that if $\Sigma$ is a compact topological space
and the function $R(\cdot, \theta)$ is upper semicontinuous for every $\theta \in \Theta$,
then the mixed extension of the normal form game $(\Sigma, \Theta, R)$ is
determined.

Let $\Gamma = (A, (\mathcal{P}_n)_{n \in \mathbb{N}}, W)$ be an infinite game with perfect recall.
The normal form of $\Gamma$ is the normal form game $N(\Gamma) = (\Sigma, \Theta, R)$
defined as follows. A pure strategy $\sigma \in \Sigma$ of player 1 is a sequence
$\{\sigma_n : \mathcal{P}_n \to A\}_{n \text{ even}}$ of functions: at stage $n$, after the finite history
$h = (h_0, h_1, \dots, h_{n-1})$ was played, player 1 plays $\sigma_n(\pi_n(h))$, where $\pi_n(h)$ is
the atom of $\mathcal{P}_n$ that contains $h$. Pure strategies $\theta$ of player 2 are
defined analogously. Every pair $\sigma, \theta$ of pure strategies of players 1 and
2 determines an infinite history $u(\sigma, \theta) = (a_0, a_1, \dots)$ that is given by

$a_n = \begin{cases} \sigma_n(\pi_n(a_0, \dots, a_{n-1})), & \text{for even } n, \\ \theta_n(\pi_n(a_0, \dots, a_{n-1})), & \text{for odd } n. \end{cases}$

The payoff function of $N(\Gamma)$ is $R(\sigma, \theta) = 1_W(u(\sigma, \theta))$. Kuhn's Theorem
[10, Theorem D.1] states the equivalence between mixed strategies
and behavioral strategies in games with perfect recall. In particular,
the game $\Gamma$ is determined if and only if its normal form game $N(\Gamma)$ is
determined.

3.1. Lemma. An infinite game with a finite action set, perfect recall,
and a compact winning set is determined.

Proof. Let $\Gamma = (A, (\mathcal{P}_n)_{n \in \mathbb{N}}, W)$ be an infinite game with a finite
action set $A$, perfect recall, and a compact winning set $W$. By Kuhn's
Theorem it is sufficient to prove that the normal form game
$N(\Gamma) = (\Sigma, \Theta, R)$ of $\Gamma$ is determined. This follows from the Minimax
Theorem. Indeed, the set $\Sigma$ of pure strategies of player 1 is compact in
the product topology, and the payoff function $R$ is upper semicontinuous
as a composition of the continuous function $(\sigma, \theta) \mapsto u(\sigma, \theta)$ and the
function $1_W$, which is upper semicontinuous because $W$ is closed. □



4. Proof of Theorem 1.3

Overview of the proof. Roughly speaking, I am going to construct
a stochastic game $\Gamma^*$ with perfect information that mimics the original
game $\Gamma$. In $\Gamma^*$, at every stage $m$, the player announces a mixture
over $A$ contingent on his information at that stage. So in $\Gamma^*$, instead of
choosing an action which is not revealed to the opponent (as in $\Gamma$), the
player announces how he intends to randomize his action. The actual
randomization is performed by Nature at a future stage $k(m)$, in which
the opponent should have observed the $m$-stage action in $\Gamma$, and the
realization of that randomization is immediately made public. So in
the game $\Gamma^*$, Nature performs the randomization for the player. By
Martin's Theorem the game $\Gamma^*$ is determined, and I prove that the value
of $\Gamma^*$ is also the value of the original game $\Gamma$. For this purpose I have to
show that the fact that in $\Gamma^*$ the player announces his randomization
plan cannot be used by the opponent to change the payoff in the game.
This step, which is the core of the proof, uses approximations of the
winning set by compact sets, and the fact that by Lemma 3.1 the
original game $\Gamma$ is determined when the winning set is compact.

Since the sets of actions must be finite for Martin's Theorem to apply,
I first prove that every behavioral strategy in $\Gamma$ can be approximated
by a behavioral strategy in which all the mixtures are taken from some
finite sets. This is done in Lemma 4.2. Because of the approximation
argument, the stochastic game $\Gamma^*$ that is constructed in the proof
depends on an additional parameter $\varepsilon$ which corresponds to the level of
approximation.

Preliminaries. Let $A^{<\mathbb{N}} = \bigcup_{n \in \mathbb{N}} A^n$ be the set of finite histories of
the game. For a finite history $h \in A^n$, the length of $h$ is given by
$\mathrm{length}(h) = n$. For an infinite history $u = (a_0, a_1, a_2, \dots) \in A^{\mathbb{N}}$ and
$n \in \mathbb{N}$, let $u|_n = (a_0, \dots, a_{n-1}) \in A^{<\mathbb{N}}$ be the initial segment of $u$ of
length $n$. Similarly, for a finite history $h \in A^{<\mathbb{N}}$ and $n \le \mathrm{length}(h)$, let
$h|_n$ be the initial segment of $h$ of length $n$.

Eventual perfect monitoring entails that the action of stage $m$ is
known to the opponent at infinity. Lemma 4.1 below shows that in
fact more is true: for every $m$ there exists some finite stage $n > m$ at
which the opponent knows the action of stage $m$.

4.1. Lemma. If the game admits eventual perfect monitoring, then for
every $m \in \mathbb{N}$ there exists an $n > m$ such that $n \not\equiv m \pmod 2$ and
such that at stage $n$ the opponent observes the action of stage $m$, i.e.,
for every pair $u = (a_0, a_1, \dots)$, $u' = (a'_0, a'_1, \dots)$ of infinite histories,
$u \sim_n u'$ implies $a_m = a'_m$.

Proof. This is an application of König's Lemma. Assume without loss
of generality that $m$ is odd. Let $a \in A$, and let
$C_a = \{u = (u_0, u_1, \dots) \in A^{\mathbb{N}} \mid u_m = a\}$. Then $C_a$ and $C_a^c$ are compact.
For a finite history $h$ of length $n$, let $\pi_n(h)$ be the atom of $\mathcal{P}_n$ that
contains $h$, and write $[\pi_n(h)]$ for the set of infinite histories whose
initial segment of length $n$ belongs to $\pi_n(h)$. Let $T_a \subseteq A^{<\mathbb{N}}$ be the set
of all finite histories $h$ of even length $n$ such that
$[\pi_n(h)] \cap C_a \ne \emptyset$ and $[\pi_n(h)] \cap C_a^c \ne \emptyset$.

It follows from the perfect recall assumption that $T_a$ is a tree over
$A^2$. I claim that $T_a$ is well-founded. Indeed, if $v$ is an infinite branch of
$T_a$, then $\bigcap_{n \text{ even}} [\pi_n(v|_n)] \cap C_a$ and $\bigcap_{n \text{ even}} [\pi_n(v|_n)] \cap C_a^c$ are nonempty
as the intersections of decreasing sequences of nonempty compact sets.
Let $u = (u_0, u_1, \dots) \in \bigcap_{n \text{ even}} [\pi_n(v|_n)] \cap C_a$ and
$u' = (u'_0, u'_1, \dots) \in \bigcap_{n \text{ even}} [\pi_n(v|_n)] \cap C_a^c$. Then $u_m = a \ne u'_m$ and
therefore $u \ne u'$, but $u \sim_n u'$ for every even $n$, in contradiction to
the eventual perfect monitoring assumption.

By König's Lemma, $T_a$ is finite. Let $n_a$ be the maximal length of
elements of $T_a$, and let $n = \max\{n_a \mid a \in A\} + 2$. Then at stage $n$ player
1 observes the action of stage $m$. □

For two strategies $x, x'$ of player 1, let $d(x, x')$, the distance between
$x$ and $x'$, be given by

$d(x, x') = \sum_{n \text{ even}} \max_p \|x_n(p) - x'_n(p)\|_1,$

where the maximum is taken over all atoms $p$ of $\mathcal{P}_n$. The distance
$d(y, y')$ between two behavioral strategies $y, y'$ of player 2 is defined
analogously.

4.2. Lemma. Let $x, x'$ be strategies of player 1 and $y, y'$ be strategies
of player 2. Then

$|\mu_{x,y}(W) - \mu_{x',y'}(W)| \le (d(x, x') + d(y, y'))/2$

for every Borel subset $W$ of $A^{\mathbb{N}}$.

Proof. The idea is to join an $(x, y)$-random play and an $(x', y')$-random
play such that the two random plays are equal with high probability.
Let $z_n : \mathcal{P}_n \to \Delta(A)$ be given by $z_n = x_n$ for even $n$ and $z_n = y_n$ for odd
$n$, and let $z'_n : \mathcal{P}_n \to \Delta(A)$ be given by $z'_n = x'_n$ for even $n$ and $z'_n = y'_n$
for odd $n$. Let $\alpha_0, \alpha'_0, \alpha_1, \alpha'_1, \dots$ be a sequence of $A$-valued random
variables defined inductively such that the conditional joint distribution
of the pair $(\alpha_n, \alpha'_n)$ given the event $\{\alpha_i = a_i, \alpha'_i = a'_i \text{ for } 0 \le i < n\}$
satisfies

(3)  $\mathbb{P}(\alpha_n = a \mid \alpha_i = a_i, \alpha'_i = a'_i \text{ for } 0 \le i < n) = z_n(a_0, \dots, a_{n-1})[a],$

(4)  $\mathbb{P}(\alpha'_n = a' \mid \alpha_i = a_i, \alpha'_i = a'_i \text{ for } 0 \le i < n) = z'_n(a'_0, \dots, a'_{n-1})[a'],$ and

(5)  $\mathbb{P}(\alpha_n \ne \alpha'_n \mid \alpha_i = a_i, \alpha'_i = a'_i \text{ for } 0 \le i < n) \le \|z_n(a_0, \dots, a_{n-1}) - z'_n(a'_0, \dots, a'_{n-1})\|_1 / 2,$

for every $n$ and every $a_0, a'_0, \dots, a_{n-1}, a'_{n-1} \in A$. The existence of
random variables $\alpha_n, \alpha'_n$ with the prescribed conditional distribution
follows from a standard coupling argument [4, Theorem 5.2]. From (3) it
follows that

$\mathbb{P}(\alpha_n = a \mid \alpha_i = a_i \text{ for } 0 \le i < n) = z_n(a_0, \dots, a_{n-1})[a]$

for every $n$ and every $a_0, \dots, a_{n-1} \in A$, i.e., that $\alpha_0, \alpha_1, \dots$ is an
$(x, y)$-random play of $\Gamma(W)$. Similarly, from (4) it follows that
$\alpha'_0, \alpha'_1, \dots$ is an $(x', y')$-random play of $\Gamma(W)$. From (5) it follows that

$\mathbb{P}(\alpha_n \ne \alpha'_n \mid \alpha_i = \alpha'_i \text{ for } 0 \le i < n) \le \max_p \|z_n(p) - z'_n(p)\|_1 / 2.$

Therefore,

$\mathbb{P}(\alpha_n \ne \alpha'_n \text{ for some } n) \le \sum_n \mathbb{P}(\alpha_n \ne \alpha'_n \mid \alpha_i = \alpha'_i \text{ for } 0 \le i < n)$
$\le \sum_n \max_p \|z_n(p) - z'_n(p)\|_1 / 2 = (d(x, x') + d(y, y'))/2.$

The assertion follows from the last inequality and the fact that $\mu_{x,y}$ and
$\mu_{x',y'}$ are the distributions of $\alpha_0, \alpha_1, \dots$ and $\alpha'_0, \alpha'_1, \dots$, respectively. □
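
The single-stage ingredient of this proof is a coupling of two distributions on A under which the two draws differ with probability equal to half their $\ell_1$ distance. The following Python sketch builds one such coupling (the usual maximal coupling) for finite distributions given as dictionaries; it is an illustration under that assumption, not the construction of the cited theorem, and the function names are hypothetical.

```python
from fractions import Fraction


def l1_distance(p, q):
    """The l1 distance between two distributions on the same finite set."""
    return sum(abs(p[a] - q[a]) for a in p)


def maximal_coupling(p, q):
    """Joint distribution {(a, b): prob} with marginals p and q.

    Mass min(p[a], q[a]) is placed on the diagonal (a, a); the remaining
    mass is spread over off-diagonal pairs in proportion to the excesses.
    """
    common = {a: min(p[a], q[a]) for a in p}
    tv = 1 - sum(common.values())  # equals l1_distance(p, q) / 2
    joint = {(a, a): c for a, c in common.items() if c > 0}
    if tv > 0:
        for a in p:
            for b in q:
                excess_p = p[a] - common[a]
                excess_q = q[b] - common[b]
                # both excesses positive forces a != b
                if excess_p > 0 and excess_q > 0:
                    joint[(a, b)] = excess_p * excess_q / tv
    return joint


def mismatch_probability(joint):
    """Probability that the two coordinates of the coupled pair differ."""
    return sum(w for (a, b), w in joint.items() if a != b)
```

With `Fraction` inputs the marginals and the mismatch probability can be compared exactly against `l1_distance(p, q) / 2`, which is the bound used in (5).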

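4.3. Corollary. Let $\Delta_{\varepsilon,n}$ be a finite set which is $\varepsilon/2^n$-dense in $\Delta(A)$
endowed with $\|\cdot\|_1$, i.e., such that the $\varepsilon/2^n$-balls around elements of $\Delta_{\varepsilon,n}$
cover $\Delta(A)$. Then there exists an $\varepsilon$-optimal strategy $y$ for player 2 in
$\Gamma(W)$ such that $y_n(p) \in \Delta_{\varepsilon,n}$ for every odd $n$ and every atom $p$ of
$\mathcal{P}_n$.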

Proof. Let $y'$ be an $\varepsilon/2$-optimal strategy of player 2 in $\Gamma(W)$ and let $y$
be a strategy of player 2 such that $\|y_n(p) - y'_n(p)\|_1 < \varepsilon/2^n$ and
$y_n(p) \in \Delta_{\varepsilon,n}$ for every odd $n$ and every atom $p$ of $\mathcal{P}_n$. Then $d(y, y') < \varepsilon$,
and therefore,

$\mu_{x,y}(W) \le \mu_{x,y'}(W) + \varepsilon/2 \le \overline{\mathrm{val}}\,\Gamma(W) + \varepsilon$

for every strategy $x$ of player 1, where the first inequality follows from
Lemma 4.2 and the second inequality from the fact that $y'$ is
$\varepsilon/2$-optimal. Therefore $y$ is $\varepsilon$-optimal. □

Nature as the players' randomization delegate. Let
$\Gamma = (A, (\mathcal{P}_n)_{n \in \mathbb{N}}, W)$ be an infinite game with perfect recall and
eventual perfect monitoring. In this section, I define an auxiliary
stochastic game $\Gamma^* = \Gamma^*(W)$ with perfect information, which mimics the
original game $\Gamma$.

Fix $\varepsilon > 0$ and, for every $n \in \mathbb{N}$, let $\Delta_{\varepsilon,n}$ be a finite set which is
$\varepsilon/2^n$-dense in $\Delta(A)$ endowed with $\|\cdot\|_1$, i.e., such that the $\varepsilon/2^n$-balls
around elements of $\Delta_{\varepsilon,n}$ cover $\Delta(A)$. For every $m \in \mathbb{N}$, fix $k(m) > m$ such
that $m \not\equiv k(m) \pmod 2$, and such that at stage $k(m)$ the opponent
observes the action of stage $m$, as in Lemma 4.1.
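
One concrete way to obtain a finite dense subset of $\Delta(A)$ is to take all distributions whose probabilities are multiples of 1/N for a large enough N. The Python sketch below rounds a distribution to such a grid; it is a minimal illustration, and the choice of grid, the bound N >= |A|/delta, and the function names are assumptions rather than the paper's construction. Rounding moves each coordinate by less than 1/N, so the $\ell_1$ error is less than |A|/N.

```python
from fractions import Fraction
from math import ceil, floor


def grid_size(delta, num_actions):
    """A denominator N with num_actions / N <= delta (sufficient, not minimal)."""
    return ceil(Fraction(num_actions) / delta)


def round_to_grid(p, N):
    """Round a distribution p (list of Fractions) to multiples of 1/N.

    Each coordinate is rounded down, and the missing mass is returned one
    unit of 1/N at a time to the coordinates with the largest fractional parts.
    """
    scaled = [x * N for x in p]
    units = [floor(s) for s in scaled]
    missing = N - sum(units)  # an integer between 0 and len(p) - 1
    by_fraction = sorted(
        range(len(p)), key=lambda i: scaled[i] - units[i], reverse=True
    )
    for i in by_fraction[:missing]:
        units[i] += 1
    return [Fraction(c, N) for c in units]


def l1(p, q):
    """The l1 distance between two distributions given as lists."""
    return sum(abs(a - b) for a, b in zip(p, q))
```

For the stage-dependent tolerance used above, one would call `grid_size(eps / 2**n, len(A))` for each stage n and round every mixture of that stage with the resulting N.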

For every $n$, let $B_n = \{b : \mathcal{P}_n \to \Delta_{\varepsilon,n}\}$ be the set of actions of
stage $n$ in $\Gamma^*(W)$, so that an action is a function from $\mathcal{P}_n$ (viewed as
a collection of atoms) to $\Delta_{\varepsilon,n}$; and let $S_n = A^{K_n}$ be the set of states
of stage $n$ in $\Gamma^*(W)$, where $K_n = \{m \mid k(m) = n\}$. Let $f_n : A^n \to S_n$
be the projection over the corresponding coordinates $m \in K_n$, and let
$F : S_0 \times S_1 \times \dots \to A^{\mathbb{N}}$ be such that

(6)  $F(f_0(u|_0), f_1(u|_1), \dots) = u$

for every $u \in A^{\mathbb{N}}$.
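
To see how the states encode the play, the following Python sketch computes the sets K_n, the projections f_n, and a finite-horizon version of the map F for one assumed revelation rule k(m). In the paper k(m) is obtained from Lemma 4.1; the rule below (player 1's actions are revealed one stage later, player 2's three stages later) and all function names are hypothetical choices made only for illustration.

```python
def k(m):
    """Hypothetical revelation stage of the action of stage m.

    It satisfies k(m) > m and has parity opposite to m, as required.
    """
    return m + 1 if m % 2 == 0 else m + 3


def K(n):
    """Stages whose actions are revealed by Nature at stage n."""
    return [m for m in range(n) if k(m) == n]


def f(n, h):
    """State of stage n: the coordinates of the history h indexed by K(n)."""
    assert len(h) == n
    return tuple(h[m] for m in K(n))


def F_prefix(states):
    """Recover every action u_m whose revelation stage k(m) has been reached.

    `states` is the list [s_0, s_1, ...] of states announced so far; the
    result maps each revealed stage m to its action.
    """
    u = {}
    for n, s in enumerate(states):
        for m, a in zip(K(n), s):
            u[m] = a
    return u
```

For a play prefix `u`, the list `[f(n, u[:n]) for n in range(len(u) + 1)]` gives the states of the first stages, and `F_prefix` applied to it returns exactly the actions that have been revealed so far, which is the finite analogue of (6).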

$\Gamma^*(W)$ is played as follows: Player 1 plays at even stages and player
2 at odd stages. At every stage $n$, Nature announces a state $s_n$ in $S_n$,
and then the player that plays at that stage announces an action $b_n$
in $B_n$. Nature chooses the state $s_n$ of stage $n$ from the distribution
$z(s_0, b_0, \dots, s_{n-1}, b_{n-1})$ that is given by

(7)  $z(s_0, b_0, \dots, s_{n-1}, b_{n-1})[s] = \mathbb{P}\bigl(f_n(\alpha_0, \dots, \alpha_{n-1}) = s \mid f_k(\alpha_0, \dots, \alpha_{k-1}) = s_k \text{ for } 0 \le k \le n - 1\bigr),$

where $\alpha_0, \dots, \alpha_n$ is a sequence of $A$-valued random variables such that

(8)  $\mathbb{P}(\alpha_k = a \mid \alpha_0, \dots, \alpha_{k-1}) = b_k(\pi_k(\alpha_0, \dots, \alpha_{k-1}))[a],$

where $\pi_k(h)$ is the atom of $\mathcal{P}_k$ that contains $h$ for every $h \in A^k$. Player
1 wins the game if $F(s_0, s_1, \dots) \in W$.

A pure strategy of player 1 in $\Gamma^*(W)$ is a sequence
$\{x^*_n : S_0 \times B_0 \times \dots \times S_{n-1} \times B_{n-1} \times S_n \to B_n\}_{n = 0, 2, \dots}$ of functions: at
stage $n$, after observing the finite history $(s_0, b_0, \dots, s_{n-1}, b_{n-1}, s_n)$,
player 1 plays $x^*_n(s_0, b_0, \dots, s_{n-1}, b_{n-1}, s_n)$. Pure strategies $y^*$ of player 2
are defined analogously. Let $X^*$ and $Y^*$ be the sets of pure strategies of
players 1 and 2 respectively. The expected payoff for player 1 in the game
$\Gamma^*(W)$ when the players play according to $(x^*, y^*)$ is given by
$R(x^*, y^*) = \mathbb{P}(F(\zeta_0, \zeta_1, \dots) \in W)$, where $\zeta_0, \beta_0, \zeta_1, \beta_1, \dots$ is a sequence of
random variables, where the values of $\beta_n$ are in $B_n$ and the values of $\zeta_n$
are in $S_n$, such that

$\mathbb{P}(\zeta_n = s \mid \zeta_0, \beta_0, \dots, \zeta_{n-1}, \beta_{n-1}) = z(\zeta_0, \beta_0, \dots, \zeta_{n-1}, \beta_{n-1})[s],$

$\beta_n = x^*_n(\zeta_0, \beta_0, \dots, \zeta_{n-1}, \beta_{n-1}, \zeta_n)$ for even $n$, and

$\beta_n = y^*_n(\zeta_0, \beta_0, \dots, \zeta_{n-1}, \beta_{n-1}, \zeta_n)$ for odd $n$.

I call such a sequence $\zeta_0, \beta_0, \zeta_1, \beta_1, \dots$ of random variables an
$(x^*, y^*)$-random play of $\Gamma^*(W)$.

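Identifying the game $\Gamma^*(W)$ with its normal form, say that $\Gamma^*(W)$
is determined if

$\sup_{\xi} \inf_{y^* \in Y^*} \int R(x^*, y^*)\, \xi(dx^*) = \inf_{\tau} \sup_{x^* \in X^*} \int R(x^*, y^*)\, \tau(dy^*),$

where $\xi$ ranges over mixed strategies of player 1 and $\tau$ over mixed
strategies of player 2. In this case the common value of the two sides of
the last equation is called the value of the game, and is denoted by
$\mathrm{val}\,\Gamma^*(W)$.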

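4.4. Lemma. Let $W \subseteq A^{\mathbb{N}}$ be a Borel set. Then the game $\Gamma^*_\varepsilon(W)$ is
determined, and $\mathrm{val}\,\Gamma^*_\varepsilon(W_0) \ge \mathrm{val}\,\Gamma^*_\varepsilon(W) - \varepsilon$ for some compact subset
$W_0$ of $W$.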

Proof. In the terminology of Appendix A, the game $\Gamma^*(W)$ is the
stochastic game with stochastic setup $\mathcal{S} = ((S_n, B_n)_{n \in \mathbb{N}}, z)$ and the
winning set $\eta^{-1}(W)$, where $\eta : S_0 \times B_0 \times S_1 \times B_1 \times \dots \to A^{\mathbb{N}}$ is the
continuous map given by

(9)  $\eta(s_0, b_0, s_1, b_1, \dots) = F(s_0, s_1, \dots).$

Thus $\eta^{-1}(W)$ is a Borel set and therefore by Proposition A.1 the game
$(\mathcal{S}, \eta^{-1}(W))$ is determined. Moreover, there exists a compact set
$C \subseteq S_0 \times B_0 \times S_1 \times B_1 \times \dots$ such that $C \subseteq \eta^{-1}(W)$ and
$\mathrm{val}(\mathcal{S}, C) \ge \mathrm{val}(\mathcal{S}, \eta^{-1}(W)) - \varepsilon$. Let $W_0 = \eta(C)$. Then $W_0$ is a compact subset
of $W$ and $\mathrm{val}(\mathcal{S}, \eta^{-1}(W_0)) \ge \mathrm{val}(\mathcal{S}, C) \ge \mathrm{val}(\mathcal{S}, \eta^{-1}(W)) - \varepsilon$, since
$\eta^{-1}(W_0) \supseteq C$. The assertion follows from the fact that the games
$(\mathcal{S}, \eta^{-1}(W_0))$ and $(\mathcal{S}, \eta^{-1}(W))$ are $\Gamma^*_\varepsilon(W_0)$ and $\Gamma^*_\varepsilon(W)$, respectively. □

The following lemma says that, up to $\varepsilon$, player 2 can guarantee in
$\Gamma^*$ the same amount he can guarantee in $\Gamma$. Intuitively, when player
2 computes the upper value of the game, he assumes that player 1 is
going to play the best response to player 2's strategy; so the fact that
in $\Gamma^*(W)$ player 2 has to declare his contingent mixed action does not
reduce the upper value of the game.



4.5. Lemma. For every Borel set $W$ of $A^{\mathbb{N}}$,

$\mathrm{val}\,\Gamma^*_\varepsilon(W) - \varepsilon \le \overline{\mathrm{val}}\,\Gamma(W).$

Proof. Note first that by definition of $K_n$ and from the perfect recall
assumption, there exist functions $g_{n,k} : \mathcal{P}_n \to S_k$ for every $n$ and every
$k \le n$ such that

(10)  $g_{n,k}(\pi_n(a_0, \dots, a_{n-1})) = f_k(a_0, \dots, a_{k-1})$

for every $h = (a_0, \dots, a_{n-1}) \in A^n$, where $\pi_n : A^n \to \mathcal{P}_n$ is the
natural projection.

Let $y$ be an $\varepsilon$-optimal behavioral strategy for player 2 in $\Gamma(W)$ such
that $y_n(p) \in \Delta_{\varepsilon,n}$ for every odd $n$ and every atom $p$ of $\mathcal{P}_n$. The
existence of such a strategy $y$ follows from Corollary 4.3. Consider a pure
strategy $y^*$ of player 2 in $\Gamma^*(W)$ that is given by
$y^*_n(s_0, b_0, \dots, s_{n-1}, b_{n-1}, s_n) = y_n$ for every odd $n$ and every partial history
$(s_0, b_0, \dots, s_{n-1}, b_{n-1}, s_n)$ of $\Gamma^*(W)$. (Thus, in every odd stage $n$, player
2's action is $y_n$, regardless of the history.) Let $x^*$ be any pure strategy
of player 1 in $\Gamma^*$. Let $x$ be the behavioral strategy of player 1 in $\Gamma(W)$
that is given by

$x_n(p) = x^*_n(s_0, b_0, \dots, s_{n-1}, b_{n-1}, s_n)(p),$

where $(s_0, b_0, \dots, s_{n-1}, b_{n-1}, s_n)$ is the finite history of $\Gamma^*(W)$ defined
inductively by $b_k = x^*_k(s_0, b_0, \dots, s_{k-1}, b_{k-1}, s_k)$ for even $k$, $b_k = y_k$ for
odd $k$, and $s_k = g_{n,k}(p)$.

I am going to join an $(x, y)$-random play of $\Gamma(W)$ and an
$(x^*, y^*)$-random play of $\Gamma^*(W)$ with equal payoffs. Let
$\zeta_0, \beta_0, \alpha_0, \zeta_1, \beta_1, \alpha_1, \dots$ be a sequence of random variables such that
the values of $\Pi_n$ are in $\mathcal{P}_n$, the values of $\zeta_n$ are in $S_n$, the values of
$\beta_n$ are in $B_n$, and the values of $\alpha_n$ are in $A$, and such that

(11)  $\Pi_n = \pi_n(\alpha_0, \dots, \alpha_{n-1}),$

(12)  $\zeta_n = f_n(\alpha_0, \dots, \alpha_{n-1}),$

(13)  $\beta_n = x^*_n(\zeta_0, \beta_0, \dots, \zeta_{n-1}, \beta_{n-1}, \zeta_n)$ for even $n$,

(14)  $\beta_n = y^*_n(\zeta_0, \beta_0, \dots, \zeta_{n-1}, \beta_{n-1}, \zeta_n)$ for odd $n$, and

(15)  $\mathbb{P}(\alpha_n = a \mid \alpha_0, \dots, \alpha_{n-1}) = \beta_n(\Pi_n)[a].$

From (10), (11), and (12) it follows that

(16)  $\zeta_k = g_{n,k}(\Pi_n)$

for every $n$ and every $k \le n$. From (14) and the definition of $y^*$, it
follows that $\beta_n = y_n$ for every odd $n$. In particular,

(17)  $y_n(\Pi_n) = \beta_n(\Pi_n)$

for every odd $n$. From (13), the definition of $x$, the fact that $\beta_k = y_k$
for every odd $k$, and (16), it follows that

(18)  $x_n(\Pi_n) = \beta_n(\Pi_n)$

for every even $n$. From (15), (17), (18), and (11), it follows that

$\mathbb{P}(\alpha_n = a \mid \alpha_0, \dots, \alpha_{n-1}) = \begin{cases} x_n(\alpha_0, \dots, \alpha_{n-1})[a], & \text{if } n \text{ is even}, \\ y_n(\alpha_0, \dots, \alpha_{n-1})[a], & \text{if } n \text{ is odd}, \end{cases}$

i.e., that $\alpha_0, \alpha_1, \dots$ is an $(x, y)$-random play of $\Gamma(W)$.
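From (11)-(15) and the definition (7) of $z$, it follows that

(19)  $\mathbb{P}(\zeta_n = s_n \mid \zeta_0, \beta_0, \dots, \zeta_{n-1}, \beta_{n-1}) = z(\zeta_0, \beta_0, \dots, \zeta_{n-1}, \beta_{n-1})[s_n].$

Indeed, given the event $\{\zeta_0 = s_0, \beta_0 = b_0, \dots, \zeta_{n-1} = s_{n-1}, \beta_{n-1} = b_{n-1}\}$,
the conditional distribution of $\alpha_0, \dots, \alpha_n$ is like the conditional
distribution of a sequence $\alpha_0, \dots, \alpha_n$ that satisfies (8) given that
$f_k(\alpha_0, \dots, \alpha_{k-1}) = s_k$ for $k < n$. (Here I use the fact that $\beta_n$ is
measurable with respect to $\zeta_0, \dots, \zeta_n$.)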

From (19), (13), and (14) it follows that $\zeta_0, \beta_0, \zeta_1, \beta_1, \dots$ is an
$(x^*, y^*)$-random play of $\Gamma^*_\varepsilon(W)$. Therefore, the expected payoff for
player 1 in $\Gamma^*(W)$ under $(x^*, y^*)$ is

$\mathbb{P}(F(\zeta_0, \zeta_1, \dots) \in W) = \mathbb{P}((\alpha_0, \alpha_1, \dots) \in W) = \mu_{x,y}(W) \le \overline{\mathrm{val}}\,\Gamma(W) + \varepsilon,$

where the first equality follows from (6) and (12), the second equality
from (2), and the inequality from the fact that $y$ is $\varepsilon$-optimal.

Summing up, I have provided a pure strategy $y^*$ of player 2 in
$\Gamma^*(W)$ (namely, play $y_1, y_3, \dots$) that gives an expected payoff of at most
$\overline{\mathrm{val}}\,\Gamma(W) + \varepsilon$ against any pure strategy $x^*$ of player 1 in $\Gamma^*_\varepsilon(W)$.
Therefore, $\mathrm{val}\,\Gamma^*_\varepsilon(W) \le \overline{\mathrm{val}}\,\Gamma(W) + \varepsilon$. □

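Proof of Theorem 1.3. Consider the stochastic game $\Gamma^*(W)$ defined
above. Let $W_0$ be a compact subset of $W$ such that
$\mathrm{val}\,\Gamma^*(W_0) \ge \mathrm{val}\,\Gamma^*(W) - \varepsilon$, and whose existence follows from Lemma 4.4. Then

$\underline{\mathrm{val}}\,\Gamma(W) \ge \underline{\mathrm{val}}\,\Gamma(W_0) = \overline{\mathrm{val}}\,\Gamma(W_0) \ge \mathrm{val}\,\Gamma^*(W_0) - \varepsilon \ge \mathrm{val}\,\Gamma^*(W) - 2\varepsilon,$

where the first inequality follows from the fact that $W \supseteq W_0$, the first
equality follows from Lemma 3.1, the second inequality follows from
Lemma 4.5, and the third inequality follows from the choice of $W_0$.

Similarly, for player 2 we get $\overline{\mathrm{val}}\,\Gamma(W) \le \mathrm{val}\,\Gamma^*(W) + 2\varepsilon$. It follows
that $\overline{\mathrm{val}}\,\Gamma(W) \le \underline{\mathrm{val}}\,\Gamma(W) + 4\varepsilon$. Since $\varepsilon$ was arbitrary, it follows that
$\underline{\mathrm{val}}\,\Gamma(W) = \overline{\mathrm{val}}\,\Gamma(W)$. □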

Appendix A. Martin's Theorem for Stochastic Games 

In this section I formulate Martin's Theorem about the determinacy 
of stochastic games. The stochastic game used in this paper has com- 
plete information, while Martin studied a more general setup in which 
the players play simultaneously. Note, however, that Martin's older 
theorem about the determinacy of Borel games [7] is not sufficient for 
my purposes because of the presence of Nature. 

A stochastic game with perfect information is given by
$((S_n, B_n)_{n \in \mathbb{N}}, z, V)$, where $B_0, B_1, \dots$ are finite sets of actions,
$S_0, S_1, \dots$ are finite sets of states or Nature's actions,
$z = \{z_n : S_0 \times B_0 \times \dots \times S_{n-1} \times B_{n-1} \to \Delta(S_n)\}$ is Nature's strategy, and
$V \subseteq S_0 \times B_0 \times S_1 \times B_1 \times \dots$ is the winning set of player 1.

The game is played as follows: Player 1 plays at even stages and
player 2 at odd stages. At every stage $n$, Nature announces a state $s_n$
in $S_n$, and then the player that plays at that stage announces an action
$b_n$ in $B_n$. Nature chooses the state $s_n$ of stage $n$ from the distribution
$z(s_0, b_0, \dots, s_{n-1}, b_{n-1})$. Player 1 wins the game if
$(s_0, b_0, s_1, b_1, \dots) \in V$.

I call a triple $\mathcal{S} = ((S_n, B_n)_{n \in \mathbb{N}}, z)$ of action sets, state sets, and
Nature's strategy a stochastic setup. So the stochastic games that I use
in this paper are given by a stochastic setup $\mathcal{S} = ((S_n, B_n)_{n \in \mathbb{N}}, z)$ and
a winning set $V \subseteq S_0 \times B_0 \times S_1 \times B_1 \times \dots$.

The definitions of strategies of the players, and of determinacy and 
value of the game, are omitted. Note that since this is a game with 
perfect information (i.e., before a player chooses the action $b_n$ of stage
$n$, he observes the finite history of the game $(s_0, b_0, \dots, s_{n-1}, b_{n-1}, s_n)$
up to that stage), Kuhn's Theorem [10, Theorem D.1] applies, so that
behavioral strategies and mixed strategies are equivalent. 

The following proposition was proved by Martin [8]. For the
stochastic extension, see Maitra and Sudderth's paper [6]. The fact that
the lower value of the game can be approximated by the value on some
compact subset was proved earlier by Maitra et al. using Choquet's
Capacity Theorem.

A.1. Proposition. Let $\mathcal{S} = ((S_n, B_n)_{n \in \mathbb{N}}, z)$ be a stochastic setup, and
let $V$ be a Borel subset of $S_0 \times B_0 \times S_1 \times B_1 \times \dots$. Then:

(1) The game $(\mathcal{S}, V)$ is determined.

(2) For every $\varepsilon > 0$, there exists a compact subset $C$ of $V$ such that

$\mathrm{val}(\mathcal{S}, C) > \mathrm{val}(\mathcal{S}, V) - \varepsilon.$